Scaling Spark in the Real World: Performance and Usability
Abstract
Apache Spark is one of the most widely used open source processing engines for big data, with rich language-integrated APIs and a wide range of libraries. Over the past two years, our group has worked to deploy Spark to a wide range of organizations through consulting relationships as well as our hosted service, Databricks. We describe the main challenges and requirements that appeared in taking Spark to a wide set of users, and usability and performance improvements we have made to the engine in response.
Similar resources
Usability evaluation of the radiology information system
Introduction: One of the health information systems used in health care settings is Radiology Information System. This system can increase the quality and accuracy of work processes in the radiology department and can reduce the number of human resources required to archive images as well as the hospital costs, and, finally, can lower the retrieval time of archived images. Lack of usability of ...
-
In this paper we start with Meyer-Ter-Vehn isobaric fusion model and try to reconstruct all equations by introducing a dimensionless variable ?i=ri/Rm. Then we investigate the proper sets of spark confinement parameter and temperature {Hs,Ts} which satisfy ignition conditions of spark ignition in deuterium-tritium (DT) equimolar mixture in terms of isentrope parameter, ?, implosion velocity, Ui...
A Detailed Exploration of Usability Statistics and Application Rating Based on Wireless Protocols
A Detailed Exploration of usability statistics and Application Rating on short-range Wireless protocols Bluetooth (IEEE 802.15.1), ZigBee (IEEE 802.15.4), Wi-Fi (IEEE 802.11) and NFC (ISO/IEC 14443) has been performed that being representing of those prominent wireless protocols evaluating their main characteristics and performances in terms of some metric such as co-existence, data rate, secur...
Spark Scalability Analysis in a Scientific Workflow
Spark is being successfully used for big data parallel processing in many business domains (social media, finance, retail). Spark’s scalability, usability, and large user community have motivated developers from scientific domains (bioinformatics, oil and gas, astronomy) to try it. However, scientific applications’ profile, e.g., black-box programs and intense file writes, differs from traditio...
Investigating the usability of an Integrated Research Automation System (SEAT): Heuristic Evaluation
Background and Objectives: Today, many hardware and software products, including office automation software, and web-based websites are used by employees, including professors and employees of different departments in offices. Websites are considered one of the main aspects of competition in any organization. This study aims to investigate the usability of the Integrated Research Automation Sys...
Journal title:
- PVLDB
Volume 8, Issue
Pages -
Publication date 2015